Deep Learning Techniques for Music Generation -- A Survey
This paper is a survey and an analysis of different ways of using deep
learning (deep artificial neural networks) to generate musical content. We
propose a methodology based on five dimensions for our analysis:
Objective - What musical content is to be generated? Examples are: melody,
polyphony, accompaniment or counterpoint. - For what destination and for what
use? To be performed by a human(s) (in the case of a musical score), or by a
machine (in the case of an audio file).
Representation - What are the concepts to be manipulated? Examples are:
waveform, spectrogram, note, chord, meter and beat. - What format is to be
used? Examples are: MIDI, piano roll or text. - How will the representation be
encoded? Examples are: scalar, one-hot or many-hot.
Architecture - What type(s) of deep neural network is (are) to be used?
Examples are: feedforward network, recurrent network, autoencoder or generative
adversarial networks.
Challenge - What are the limitations and open challenges? Examples are:
variability, interactivity and creativity.
Strategy - How do we model and control the process of generation? Examples
are: single-step feedforward, iterative feedforward, sampling or input
manipulation.
For each dimension, we conduct a comparative analysis of various models and
techniques and we propose some tentative multidimensional typology. This
typology is bottom-up, based on the analysis of many existing deep-learning
based systems for music generation selected from the relevant literature. These
systems are described and are used to exemplify the various choices of
objective, representation, architecture, challenge and strategy. The last
section includes some discussion and some prospects.
Comment: 209 pages. This paper is a simplified version of the book: J.-P.
Briot, G. Hadjeres and F.-D. Pachet, Deep Learning Techniques for Music
Generation, Computational Synthesis and Creative Systems, Springer, 201
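
The encoding choices named under Representation (one-hot, many-hot) can be illustrated with a minimal sketch. The pitch range, constants and function names below are assumptions made for illustration; they are not taken from the survey.

```python
# Illustrative sketch only: the pitch range and helper names are assumed.
LOW_PITCH = 60    # MIDI note number of middle C, used as the lowest pitch
NUM_PITCHES = 12  # one octave, kept small for readability


def one_hot(pitch):
    """Encode a single pitch as a vector containing exactly one 1."""
    vec = [0] * NUM_PITCHES
    vec[pitch - LOW_PITCH] = 1
    return vec


def many_hot(pitches):
    """Encode simultaneous pitches (a chord) as a vector with several 1s."""
    vec = [0] * NUM_PITCHES
    for pitch in pitches:
        vec[pitch - LOW_PITCH] = 1
    return vec


def piano_roll(chords):
    """Stack one many-hot vector per time step into a piano-roll matrix."""
    return [many_hot(chord) for chord in chords]


melody_step = one_hot(64)            # a single E above middle C
chord_step = many_hot([60, 64, 67])  # a C major triad
roll = piano_roll([[60], [60, 64, 67], []])  # the empty list is a rest
```

A one-hot vector suits monophonic melodies, while a many-hot vector can represent polyphony at a single time step.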
Byte Pair Encoding for Symbolic Music
When used with deep learning, the symbolic music modality is often coupled
with language model architectures. To do so, the music needs to be tokenized,
i.e. converted into a sequence of discrete tokens. This can be achieved by
different approaches, as music can be composed of simultaneous tracks and of
simultaneous notes with several attributes. Until now, the proposed
tokenizations have relied on small vocabularies of tokens describing the note
attributes and time events, resulting in fairly long token sequences and a
sub-optimal use of the embedding space of language models. Recent research has
focused on reducing the overall sequence length by merging embeddings or
combining tokens. In this paper, we show that Byte Pair Encoding, a compression
technique widely used for natural language, significantly decreases the
sequence length while increasing the vocabulary size. By doing so, we leverage
the embedding capabilities of such models with more expressive tokens,
resulting in both better results and faster inference in generation and
classification tasks. The source code is shared on Github, along with a
companion website. Finally, BPE is directly implemented in MidiTok, allowing
the reader to easily benefit from this method.
Comment: EMNLP 2023, source code: https://github.com/Natooz/BPE-Symbolic-Musi
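
The trade-off described in this abstract, shorter sequences in exchange for a larger vocabulary, can be seen in a toy Byte Pair Encoding learner. This is a minimal sketch, not the authors' implementation and not the MidiTok API; the token names and the `+` joining convention are assumptions.

```python
from collections import Counter


def apply_merge(seq, a, b, new_token):
    """Replace every non-overlapping occurrence of the pair (a, b) in seq."""
    out, i = [], 0
    while i < len(seq):
        if i + 1 < len(seq) and seq[i] == a and seq[i + 1] == b:
            out.append(new_token)
            i += 2
        else:
            out.append(seq[i])
            i += 1
    return out


def learn_bpe(sequences, num_merges):
    """Greedily merge the most frequent adjacent token pair, num_merges times.

    Stops early when no pair occurs at least twice.
    Returns the list of learned merges and the re-encoded sequences.
    """
    seqs = [list(s) for s in sequences]
    merges = []
    for _ in range(num_merges):
        pairs = Counter()
        for s in seqs:
            pairs.update(zip(s, s[1:]))
        if not pairs:
            break
        (a, b), count = pairs.most_common(1)[0]
        if count < 2:
            break
        merges.append((a, b))
        seqs = [apply_merge(s, a, b, a + "+" + b) for s in seqs]
    return merges, seqs


# Hypothetical note tokens: three notes sharing velocity and duration.
tokens = ["Pitch_60", "Vel_100", "Dur_4",
          "Pitch_64", "Vel_100", "Dur_4",
          "Pitch_60", "Vel_100", "Dur_4"]
merges, encoded = learn_bpe([tokens], num_merges=10)
```

On this input the 9-token sequence shrinks to 4 tokens after two merges, while the vocabulary grows by two entries.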
An Approach to Operationalize Regulative Norms in Multiagent Systems
Collective management of environmental commons with multiple usages: a guaranteed viability approach
In this paper we address the collective management of environmental commons
with multiple usages in the framework of the mathematical viability theory. We
consider that the stakeholders can derive from the study of their own
socioeconomic problem the variables describing their different usages of the
commons and its evolution, and a representation of the desirable states for the
commons. We then consider the guaranteed viability kernel, a subset of the set of
desirable states where it is possible to maintain the state of the commons even
when its evolution is represented by several conflicting models. This approach
is illustrated on a problem of lake eutrophication.
Comment: 22 pages, 4 figure
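
The idea of a guaranteed viability kernel can be sketched on a finite, discrete-time toy problem. Everything below is an assumption made for illustration: the integer "phosphorus level" state, the two controls, the two hypothetical models, and the reading that a single control must keep the state viable under every model at once. The paper's own setting and algorithm may differ.

```python
def guaranteed_viability_kernel(desirable, controls, models):
    """Largest subset of `desirable` in which, from every state, some control
    keeps the next state inside the subset under ALL models simultaneously."""
    kernel = set(desirable)
    changed = True
    while changed:
        changed = False
        for x in list(kernel):
            viable = any(
                all(f(x, u) in kernel for f in models) for u in controls
            )
            if not viable:
                kernel.remove(x)
                changed = True
    return kernel


# Hypothetical conflicting models of a lake: the phosphorus level x starts
# rising by itself above a threshold, and the two models disagree on it.
def model_a(x, u):
    return x + u + (2 if x >= 4 else 0)


def model_b(x, u):
    return x + u + (2 if x >= 3 else 0)


desirable = range(0, 6)  # levels 0..5 are considered acceptable
controls = [0, -1]       # do nothing, or reduce inputs by one unit

guaranteed = guaranteed_viability_kernel(desirable, controls, [model_a, model_b])
only_a = guaranteed_viability_kernel(desirable, controls, [model_a])
```

In this toy case the guaranteed kernel is {0, 1, 2}, smaller than the kernel {0, 1, 2, 3} obtained when only `model_a` is trusted.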
miditok: A Python package for MIDI file tokenization
Recent progress in natural language processing has been adapted to the
symbolic music modality. Language models, such as Transformers, have been used
with symbolic music for a variety of tasks, including music generation,
modeling and transcription, with state-of-the-art performance. These models are
beginning to be used in production. To encode and decode music for the
backbone model, they need to rely on tokenizers, whose role is to serialize
music into sequences of distinct elements called tokens. MidiTok is an
open-source library that tokenizes symbolic music with great flexibility
and extended features. It provides the most popular music tokenizations under
a unified API and is designed to be easy to use and extend.
Comment: Updated and comprehensive report. Original ISMIR 2021 document at
https://archives.ismir.net/ismir2021/latebreaking/000005.pd
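
What it means to serialize music into tokens can be shown with a small self-contained sketch. It deliberately does not use the MidiTok API; the note format `(onset, pitch, duration)` and the token names are assumptions chosen for illustration.

```python
def tokenize(notes):
    """Serialize (onset, pitch, duration) notes into a flat token list."""
    tokens, now = [], 0
    for onset, pitch, duration in sorted(notes):
        if onset > now:
            # Advance the clock to the onset of the next note.
            tokens.append(f"TimeShift_{onset - now}")
            now = onset
        tokens.append(f"Pitch_{pitch}")
        tokens.append(f"Duration_{duration}")
    return tokens


def detokenize(tokens):
    """Rebuild the note list from tokens produced by tokenize()."""
    notes, now, pitch = [], 0, None
    for token in tokens:
        kind, value = token.split("_")
        value = int(value)
        if kind == "TimeShift":
            now += value
        elif kind == "Pitch":
            pitch = value
        elif kind == "Duration":
            notes.append((now, pitch, value))
    return notes


# Two simultaneous notes followed by a third one four time units later.
notes = [(0, 60, 4), (0, 64, 4), (4, 67, 2)]
tokens = tokenize(notes)
assert detokenize(tokens) == sorted(notes)
```

The round trip in the last line checks that encoding and decoding are inverse operations on this input, which is the property a backbone model relies on.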
Refinement operators to facilitate the reuse of interaction laws in open multi-agent systems
As new software demands and requirements appear, the system and its interaction laws must evolve to support those changes. Languages and models should provide the tools for dealing with this evolution. Poor support for evolution has a negative impact on system maintainability. In this paper, we propose some refinement operators to extend the interaction laws in open multi-agent systems. As an example of this idea, we implemented a customizable application in the supply chain management domain as an open system environmen